Semantic Anonymisation of Set-valued Data

Authors

  • Montserrat Batet
  • Arnau Erola
  • David Sánchez
  • Jordi Castellà-Roca
Abstract

Companies and organisations often need to release and exchange information about individuals. Because such data are usually sensitive, appropriate measures should be applied to reduce the risk of re-identification while preserving as much data utility as possible. Many anonymisation mechanisms have been developed to date, but most of them focus on structured/relational databases containing numerical or categorical data. The anonymisation of transactional data, also known as set-valued data, has received much less attention. Managing and transforming these data presents additional challenges because of their variable cardinality and their usually textual and unbounded nature. Current approaches for set-valued data are based on generalising the original values; however, generalisation suffers from high information loss caused by the reduced granularity of the output values. To tackle this problem, in this paper we adapt a well-known microaggregation anonymisation mechanism so that it can be applied to textual set-valued data. Moreover, since the utility of textual data is closely related to their meaning, special care has been taken to preserve data semantics. To this end, appropriate semantic similarity and aggregation functions are proposed. Experiments conducted on a real set-valued data set show that our proposal preserves data utility better than non-semantic approaches.
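
As a rough illustration of the idea, and not the authors' algorithm, the following Python sketch runs a fixed-size microaggregation over set-valued records. Everything in it is an assumption made for the example: the toy `PARENT` taxonomy, the edge-counting `term_distance`, the averaged `set_distance`, the use of a medoid as the aggregation function, and the simplified MDAV-style grouping loop. Records are assumed to be non-empty sets whose terms all appear in the taxonomy.

```python
# Hypothetical toy taxonomy (child -> parent). A real system would use an
# ontology; all terms here share the root "topic".
PARENT = {
    "flu": "disease", "cold": "disease",
    "hotel": "travel", "flight": "travel",
    "disease": "topic", "travel": "topic",
}

def ancestors(term):
    """Path from the term up to the taxonomy root, term included."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def term_distance(a, b):
    """Taxonomy edges between a and b through their lowest common ancestor."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(t for t in pa if t in pb)  # assumes a shared root
    return pa.index(common) + pb.index(common)

def set_distance(x, y):
    """Symmetric average of each term's distance to the closest term in the other set."""
    d_xy = sum(min(term_distance(a, b) for b in y) for a in x) / len(x)
    d_yx = sum(min(term_distance(a, b) for a in x) for b in y) / len(y)
    return (d_xy + d_yx) / 2

def medoid(records):
    """Record with the smallest total distance to the records in the list."""
    return min(records, key=lambda r: sum(set_distance(r, o) for o in records))

def microaggregate(records, k):
    """Group records into clusters of at least k (when len(records) >= k)
    and return one (representative, group_size) pair per cluster."""
    remaining = list(records)
    groups = []
    while len(remaining) >= 2 * k:
        centre = medoid(remaining)
        # Seed each group with the record farthest from the current centre.
        seed = max(remaining, key=lambda r: set_distance(r, centre))
        remaining.sort(key=lambda r: set_distance(r, seed))
        groups.append(remaining[:k])
        remaining = remaining[k:]
    if remaining:
        groups.append(remaining)  # last group holds between k and 2k-1 records
    return [(medoid(g), len(g)) for g in groups]

records = [frozenset(s) for s in (
    {"flu"}, {"cold"}, {"flu", "cold"},
    {"hotel"}, {"flight"}, {"hotel", "flight"},
)]
for representative, size in microaggregate(records, k=3):
    print(sorted(representative), size)
```

In a release, every record of a group would be replaced by the group's representative, so each published record is shared by at least k individuals. Choosing an existing record (the medoid) as the representative is only one possible aggregation; it is used here because it keeps the output at the original level of detail instead of generalising terms.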

Similar resources

Semantic Anonymisation of Categorical Datasets

The exploitation of microdata compiled by statistical agencies is of great interest for the data mining community. However, such data often include sensitive information that can be directly or indirectly related to individuals. Hence, an appropriate anonymisation process is needed to minimise the risk of disclosing identities and/or confidential data. In the past, many anonymisation methods ha...

Semantic attack on transaction data anonymised by set-based generalisation

Publishing data that contains information about individuals may lead to privacy breaches. However, data publishing is useful to support research and analysis. Therefore, privacy protection in data publishing becomes important and has received much recent attention. To improve privacy protection, many researchers have investigated how secure the published data is by designing de-anonymisation me...

Towards the Anonymisation of RDF Data

Privacy protection in published data sets is of crucial importance, and anonymisation is one well-known technique for privacy protection that has been successfully used in practice. However, existing anonymisation frameworks have in mind specific data structures (i.e., tabular data) and, because of this, these frameworks are difficult to apply in the case of RDF data. This paper presents an RDF...

Anonymising Research Data

This document outlines some thoughts and discussions we have been having about strategies for anonymising data to be collected through the ESRC / NCRM Real Life Methods Node Connected Lives project. It is commonplace for social science research to adopt a policy of 'blanket anonymisation', whereby all names, places and other identifying features are disguised across a data set, including ...

Some Results about Set-Valued Complementarity Problem

This paper considers the notions of the complementarity problem (CP) and the set-valued complementarity problem (SVCP). The set-valued complementarity problem is compared with the classical single-valued complementarity problem. The solution set of the set-valued complementarity problem is also characterized. Our results are illustrated by some examples.

Publication date: 2014